Syllable Based Audio Search Using Confusion Network Arc as Indexing Unit
Abstract
Compared to English, Chinese has a simpler and more restricted syllabic structure. To exploit this characteristic, the syllable is selected as the unit for ASR lattice representation. For fast retrieval, syllable lattices are clustered into confusion networks, which are linear lattices, and then encoded into an inverted index. To recover the posterior probabilities of word hypotheses pruned from the confusion network, a syllable confusion matrix is used to calculate the relevance score of a given keyword. Experiments on the corpora for the keyword spotting task in the 2005 HTRDP ASR Evaluation show that the proposed approach yields a compact inverted index, supports quick keyword queries, and achieves an EER of 46.75%.
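The indexing and scoring idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the index layout or the scoring formula, so the function names (build_index, slot_scores, search), the representation of a confusion network as a list of {syllable: posterior} slots, and the product-of-slot-scores relevance measure are all hypothetical, not the authors' method.

```python
from collections import defaultdict


def build_index(networks):
    """Build an inverted index keyed by syllable.

    networks: {utt_id: [slot, ...]} where each slot is a dict
    {syllable: posterior} holding the arcs of one confusion-network slot.
    Each posting records (utt_id, slot position, arc posterior).
    """
    index = defaultdict(list)
    for utt_id, slots in networks.items():
        for pos, slot in enumerate(slots):
            for syllable, posterior in slot.items():
                index[syllable].append((utt_id, pos, posterior))
    return index


def slot_scores(index, confusion, query_syllable):
    """Score every slot against one query syllable.

    confusion[q][r] is an assumed weight for "query syllable q was
    recognised as r". An exact match gets weight 1.0 unless the matrix
    says otherwise.
    """
    weights = dict(confusion.get(query_syllable, {}))
    weights.setdefault(query_syllable, 1.0)
    scores = defaultdict(float)
    for recognised, weight in weights.items():
        for utt_id, pos, posterior in index.get(recognised, ()):
            scores[(utt_id, pos)] += weight * posterior
    return scores


def search(index, confusion, keyword):
    """Find consecutive slots matching the keyword's syllable sequence.

    Relevance is the product of per-slot scores (an assumed formula).
    Returns (utt_id, start position, score) tuples, best first.
    """
    per_syllable = [slot_scores(index, confusion, s) for s in keyword]
    hits = []
    for (utt_id, pos), score in per_syllable[0].items():
        for offset in range(1, len(keyword)):
            score *= per_syllable[offset].get((utt_id, pos + offset), 0.0)
            if score == 0.0:
                break
        if score > 0.0:
            hits.append((utt_id, pos, score))
    hits.sort(key=lambda hit: -hit[2])
    return hits
```

In this sketch the confusion matrix lets a query syllable match arcs carrying an acoustically similar syllable, which is one plausible way to give a nonzero score to hypotheses whose exact arcs were pruned. A query is a list of syllables, for example search(index, confusion, ["bei", "jing"]).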
Similar resources
A fast fuzzy keyword spotting algorithm based on syllable confusion network
This paper presents a fast fuzzy search algorithm to extract keyword candidates from syllable confusion networks (SCNs) in Mandarin spontaneous speech. Since the recognition accuracy of spontaneous speech is quite poor, syllable confusion matrix (SCM) is applied to compensate for the recognition errors and to improve recall. For fast retrieval, an efficient vocabulary-independent index structur...
Using syllable-based indexing features and language models to improve German spoken document retrieval
Spoken document collections with high word-type/word-token ratios and heterogeneous audio continue to constitute a challenge for information retrieval. The experimental results reported in this paper demonstrate that syllable-based indexing features can outperform word-based indexing features on such a domain, and that syllable-based speech recognition language models can successfully be used t...
Word and Sub-word Indexing Approaches for Reducing the Effects of OOV Queries on Spoken Audio
We explore the problem of out of vocabulary (OOV) queries in audio indexing systems by comparing three indexing methods on a broadcast news repository containing 75 hours of audio. Our systems are word-based, phoneme-based and a novel system based on syllable-like units called particles. To better examine the performance of these three approaches we use a query set where the percentage of OOVs ...
Automatic syllable-based phoneme recognition using ESTER Corpus
This paper presents an evaluation of speaker-independent continuous phoneme recognition systems on the French speech database ESTER. The tested systems are syllable-based phoneme recognizers, i.e. they use syllables as basic units together with syllabic bigram language models and HMM topologies adapted to syllables. Once identified, syllables are converted back to phones. In a previous paper, w...
Fast vocabulary-independent audio search using path-based graph indexing
Classical audio retrieval techniques consist in transcribing audio documents using a large vocabulary speech recognition system and indexing the resulting transcripts. However, queries that are not part of the recognizer’s vocabulary or have a large probability of getting misrecognized can significantly impair the performance of the retrieval system. Instead, we propose a fast vocabulary indepe...
Publication date: 2006